in silico Plants
Oxford University Press (OUP)
Preprints posted in the last 30 days, ranked by how well they match in silico Plants's content profile, based on 24 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.
Bauget, F.; Ndour, A.; Boursiac, Y.; Maurel, C.; Laplaze, L.; Lucas, M.; Pradal, C.
Drought is a significant factor in agricultural losses, making it imperative to understand how root system architecture (RSA) adapts to environmental conditions such as water deficit. HydroRoot is a functional-structural plant model (FSPM) for analyzing and simulating hydraulic and solute transport in the RSA. The model integrates a static hydraulic solver, a coupled water-solute transport solver, a statistical generator of RSA based on a Markov model, and a dynamic hydraulic model accounting for root growth. This paper presents the model, the mathematical formalism of the solvers, and use cases with their associated tutorials. Five use cases illustrate the capabilities of HydroRoot, which has been successfully used for phenotyping root hydraulics across various species, including Arabidopsis, maize, and millet. The model-driven phenotyping method "cut and flow" is presented to characterize axial and radial conductivities for a given root genotype. Finally, three step-by-step tutorials provide a structured way to learn how to use HydroRoot 1) to simulate hydraulics on a given architecture, 2) to simulate water and solute transport on a maize root, and 3) to simulate hydraulics on two pearl millet genotypes under varying soil conditions. HydroRoot is an open-source package of the OpenAlea platform, with the code publicly available on GitHub. Comprehensive documentation is available with a reproducible gallery of examples.
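To make the roles of axial and radial conductance concrete, the sketch below treats a single unbranched root as a resistor-like network and solves for the xylem water potential in each segment. It is an illustration of the general electrical analogy only, not HydroRoot's solver or API: the function names, the uniform conductances, and the arbitrary units are all assumptions.

```python
def solve_root(n, k_radial, k_axial, psi_soil, psi_base):
    """Xylem potentials for an unbranched root of n segments (hypothetical model).

    Node 0 is nearest the root base, node n-1 is the closed tip. Each node
    exchanges water with the soil through k_radial and with its neighbours
    (or the base) through k_axial. Mass balance gives a tridiagonal system,
    solved here with the Thomas algorithm.
    """
    diag = [k_radial + k_axial + (k_axial if i < n - 1 else 0.0) for i in range(n)]
    rhs = [k_radial * psi_soil for _ in range(n)]
    rhs[0] += k_axial * psi_base  # boundary condition at the root base

    # Forward elimination (sub- and super-diagonal entries are both -k_axial).
    for i in range(1, n):
        w = -k_axial / diag[i - 1]
        diag[i] -= w * (-k_axial)
        rhs[i] -= w * rhs[i - 1]

    # Back substitution.
    psi = [0.0] * n
    psi[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        psi[i] = (rhs[i] + k_axial * psi[i + 1]) / diag[i]
    return psi


def base_outflow(psi, k_axial, psi_base):
    """Water flow leaving the root at its base."""
    return k_axial * (psi[0] - psi_base)


def radial_inflow(psi, k_radial, psi_soil):
    """Total water uptake from the soil over all segments."""
    return sum(k_radial * (psi_soil - p) for p in psi)


psi = solve_root(n=20, k_radial=0.05, k_axial=5.0, psi_soil=0.0, psi_base=-1.0)
print(base_outflow(psi, 5.0, -1.0), radial_inflow(psi, 0.05, 0.0))
```

In this toy setting the two printed values agree, because whatever enters radially must leave at the base. A real architecture would add branching and position-dependent conductances.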
Kottelenberg, D. B.; Morales, A.; Anten, N. P. R.; Bastiaans, L.; Evers, J. B.
In cereal-legume intercrops, weed suppression is primarily driven by cereals, whose competitiveness is shaped by trait plasticity, that is, morphological adjustments in response to the intercrop environment. However, how individual cereal traits respond plastically and contribute to system performance remains unclear, hampering improvements through breeding or system design. We combined field experiments with functional-structural plant (FSP) modelling to quantify plastic responses of four cereal traits (tiller number, tiller angle, specific leaf area (SLA), and specific internode length (SIL)) and their effects on weed suppression and crop productivity. Field measurements revealed plasticity in tiller number, tiller angle, and SIL between sole crops and intercrops, while SLA showed minimal differences. Simulations showed that intermediate tiller numbers resulted in the strongest weed suppression and highest productivity, indicating an optimum, while more horizontal tillers suppressed weeds slightly better than vertical ones. Weed suppression increased with higher SLA values, while SIL showed a saturating response, increasing up to intermediate SIL values and plateauing thereafter. In simulations with short-statured cereal phenotypes (low SIL), the reduction in cereal weed suppression was compensated by the legume component. This study demonstrates how FSP modelling can be used to investigate trait plasticity mechanisms and generate testable hypotheses about trait effects in complex intercrop systems.
Highlight: Cereal trait plasticity shapes weed suppression in cereal-legume intercrops, with distinct response patterns per trait, while legumes can compensate for weakly competitive cereals, suggesting balanced competition over cereal dominance.
Salomon, J.; Enjalbert, J.; Flutre, T.
The genetics of interspecific groups remains largely unexplored, despite the central role of social (or indirect) genetic effects in shaping phenotypic expression within communities. Intercropping, i.e. the simultaneous cultivation of multiple crop species in the same field, offers a powerful model to harness these interspecific social effects. Such species mixtures provide well-documented agricultural benefits, yet few breeding frameworks have integrated the genetics of social interactions. Here, we address this gap by extending quantitative genetic theory to interspecific groups, with intercropping as a concrete and applied model case. We propose a quantitative genetic model that jointly analyzes intra- and interspecific interactions within a unifying framework. Breeding values are decomposed into a direct component, shared between mono- and mixed crops, an interspecific social component corresponding to the effect of one species on another, and an intraspecific component that captures the social effects within a mono-genotypic stand of cloned plants. Statistically, this amounts to simultaneously fitting several linear mixed models, one per stand type, all having direct breeding values in common. As no open-source software can fit such a complex mixed model, we provide an implementation in R/C++. Simulations across various genetic (co)variance structures and sparse experimental designs showed accurate estimation of all genetic (co)variances and breeding values. With an incomplete, yet balanced design combining sole crops and intercrops, genetic gains in both systems were achievable simultaneously, enabling breeding strategies that progressively integrate intercropping into existing, sole-crop-only schemes. More broadly, this framework allows dissecting direct and social genetic effects when genotypes are observed in mono- and mixed-species situations, cultivated or not.
Monyak, T.; Morris, G.
Global networks of crop breeding programs leverage diverse germplasm, but diversity increases the complexity of maintaining stability in their elite genepools. To characterize genetic heterogeneity in breeding metapopulations and develop insights on how to manage it, we simulated the evolution of breeding populations on fitness landscapes. We revealed the geometric decrease in the average effect size of alleles segregating as standing variation that become fixed along an adaptive walk. We also demonstrated how independent adaptive walks of subpopulations are influenced by genetic drift, leading to cryptic genetic heterogeneity among elite genepools. This variation is released when elite lines derived from independent subpopulations are crossed, leading to segregation for 2-4X more major QTL in admixed families than in unadmixed families, and 2-4X more epistatic interactions. The emergent property of fitness epistasis for traits under stabilizing selection is well-understood in evolutionary genetics, but under-appreciated in crop quantitative genetics. To highlight the importance of this phenomenon, we constructed an empirical genotype-to-fitness landscape from the sorghum NAM, a global admixed prebreeding resource, demonstrating the utility of fitness landscapes for inferring genetic compatibilities within metapopulations. Our findings suggest that in breeding networks, strategies for effective germplasm exchange must account for epistasis in the oligogenic component of the genetic architecture of locally-adapted traits.
Article summary: Modern public sector crop improvement happens in networks of breeding programs that routinely exchange genetic information. Traditional models for understanding quantitative traits have limited predictiveness in situations with such genetic heterogeneity. This study uses breeding simulations and empirical data to show the utility of the fitness landscape framework for characterizing the genetic architecture of complex traits in breeding metapopulations. By simulating the evolution of breeding programs and integration into networks, it demonstrates how epistatic interactions between large-effect alleles are a fundamental property that must be accounted for when exchanging germplasm.
Lourenco, V. M.; Ogutu, J. O.; Piepho, H.-P.
Data contamination, from recording errors to extreme outliers, can compromise statistical models by biasing predictions, inflating prediction errors, and, in severe cases, destabilizing performance in high-dimensional settings. Although contamination can affect responses and covariates, we focus on response contamination and evaluate Random Forests through simulation. Using a synthetic animal-breeding dataset, we assess robust Random Forests across several contamination scenarios and validate them on plant and animal datasets. We thereby clarify the consequences of contamination for prediction, develop a robust Random Forest framework, and evaluate its performance. We examine preprocessing or data-transformation strategies, algorithmic modifications, and hybrid approaches for robustifying Random Forests. Across these approaches, data transformation emerges as the most effective strategy, delivering the strongest performance under contamination. This strategy is simple, general, and transferable to other Machine Learning methods, offering a remedy for robust genomic prediction. In real breeding data, robust Random Forests are useful when substantial contamination, phenotypic corruption, misrecording, or train-deployment mismatch is plausible and the goal is to recover a latent signal for genomic prediction and selection; ranking-based robust Random Forests are the dependable first option, whereas weighting-based Random Forests should be used only when their weighting scheme preserves rank structure and improves prediction. Robustification is not universally necessary, but it becomes important when contamination distorts the link between observed responses and the predictive target; standard Random Forests remain the default for clean data, whereas robust Random Forests should be fitted alongside them whenever contamination is plausible, with the final choice guided by data, trait, and breeding objective.
Author summary: Machine learning (ML) methods are widely used for prediction with high-dimensional, complex data, and supervised approaches such as Random Forests (RF) have proved effective for genomic prediction (GP) and selection. Yet their performance can be severely compromised by data contamination if the algorithms rely on classical data-driven procedures that are sensitive to atypical observations. Robustifying ML methods is therefore important both for improving predictive performance under contamination and for guiding their practical use in high-dimensional prediction problems. To address this need, we develop robust preprocessing, algorithm-level, and hybrid strategies for improving RF performance with contaminated data. Using simulated animal data, we show that ranking- and weighting-based robust RF provide the strongest overall compromise for genomic prediction and selection under contamination. Validation on several plant and animal breeding datasets further shows that the benefits of robustification are not universal, but depend on the dataset, trait, and breeding objective. Although motivated by RF, the framework we propose is general, practical, and readily transferable to other ML methods. It also offers a basis for deciding when robustness should complement standard RF rather than replace it outright.
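The abstract does not say which response transformation is used, so the following is only one plausible illustration of a ranking-based preprocessing step: a rank-based inverse normal transform applied to the response before any model is fitted. The function name and the `(rank - 0.5) / n` offset are assumptions chosen for the sketch.

```python
from statistics import NormalDist


def rank_normal_scores(y):
    """Map responses to normal scores via their ranks (ties get average rank).

    Because only the ordering of y is used, a grossly contaminated value has
    the same influence as any other largest observation.
    """
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and y[order[j + 1]] == y[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied run
        for t in range(i, j + 1):
            ranks[order[t]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]


clean = [1.0, 2.0, 3.0, 4.0]
contaminated = [1.0, 2.0, 3.0, 1000.0]  # last phenotype misrecorded
print(rank_normal_scores(clean) == rank_normal_scores(contaminated))
```

The transformed responses are identical for the clean and contaminated vectors, which is the property that makes rank-based preprocessing attractive under response contamination. The cost is that information about the size of genuine differences is discarded.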
Sato, Y.; Hamazaki, K.
Individual phenotypes often depend on the genotypes of other individuals within a group. These phenomena are termed indirect genetic effects (IGEs) and have been distinguished from direct genetic effects (DGEs) using quantitative genetic models. Recent studies have utilized high-resolution polymorphism data to enable genomic prediction (GP) and genome-wide association study (GWAS) of IGEs, but unified methods remain limited. Here we integrate polygenic and oligogenic IGEs using a multi-kernel mixed model incorporating two random effects with a single covariance parameter. Underlying this implementation, the Ising model of ferromagnetism enabled us to simplify locus-wise and background IGEs for GWAS and GP, respectively. Our simulations demonstrated that, while the previous and present models exhibited similar performance, the present model can infer a trade-off between DGEs and IGEs. By applying this method to three species of woody plants, we found evidence for intergenotypic competition in aspen and apple trees, but limited evidence in climbing grapevines. Based on GWAS, we also detected significant variants associated with the competitive IGEs on apple trunk growth. Our study offers a flexible implementation for GWAS/GP of IGEs, thereby providing an effective tool to dissect the genetic architecture of group performance.
Moslemi, C.; Folgoas, M.; Yu, X.; Jensen, J. D.; Hentrup, S.; Li, T.; Wang, H.; Boelt, B.; Asp, T.; Sibout, R.; Ramstein, G. P.
Computational tools, including biological language models (LMs), show substantial promise in predicting the impact of genetic variants on plant fitness. However, validating variant effect predictions (VEP) requires experimental populations where genetic variation consists of discrete point mutations rather than segregating recombination blocks. In this study, we generated a novel population of Brachypodium distachyon mutant lines to evaluate the accuracy of VEP at single-base resolution. These lines were advanced through single-seed descent for five generations (M1 to M5), with whole-genome sequencing performed at M2 and M5 and phenotypic measurements recorded at M3 and M4. Using state-of-the-art VEP models, we predicted the functional impact of missense protein-coding variants and gene-proximal non-coding variants. We validated these predictions by estimating the effect of mutations on whole-plant measurements (burden tests) and their probability of fixation from M2 to M5 (purging tests). Among missense variants, the protein LM ESM showed superior predictive accuracy compared to the bioinformatic standard SIFT and the genomic LM PlantCAD. Notably, the relationship between VEP scores and allele fixation suggested a log-linear relationship between VEP scores and variant fitness. Among gene-proximal variants, PlantCAD appeared more accurate than supervised models of regulatory activity, such as chromatin accessibility (a2z) and RNA abundance (PhytoExpr). Collectively, our findings highlight the utility of state-of-the-art VEP tools as predictors of fitness and demonstrate the potential of mutant populations to evaluate computational tools for precision breeding applications.
Proma, S.; Lubanga, N.; Sacks, E.; Leakey, A. D. B.; Zhao, H.; Ghimire, B. K.; Lipka, A. E.; Njuguna, J. N.; Yu, C. Y.; Seong, E. S.; Yoo, J. H.; Nagano, H.; Anzoua, K. G.; Yamada, T.; Chebukin, P.; Jin, X.; Clark, L. V.; Petersen, K. K.; Peng, J.; Sabitov, A.; Dzyubenko, E.; Dzyubenko, N.; Glowacka, K.; Nascimento, M.; Campana Nascimento, A. C.; Dwiyanti, M. S.; Bagment, L.; Shaik, A.; Garcia-Abadillo, J.; Jarquin, D.
Phenotyping high-biomass perennial crops is laborious and the rate of genetic gain in perennial crop breeding programs is typically low. It is therefore especially important to identify methods that produce efficiency gains in the breeding process. Miscanthus is a C4 perennial grass with favorable characteristics for producing biomass as a feedstock for biofuels and diverse biobased products. Increasing biomass yield will increase profitability and environmental benefits, and is therefore a key target for Miscanthus breeding. In addition, the identification of well-adapted genotypes across a wide range of environmental conditions requires the establishment of multi-environment trials (METs). Sparse testing is a genomic prediction-based strategy that reduces the phenotyping costs in METs by selecting a subset of genotypes to evaluate in a subset of environments and then predicting the performance of the unobserved genotype-environment combinations. A Miscanthus sacchariflorus (MSA) population comprising 336 genotypes observed across three environments was analyzed. Three prediction models considering main effects (environments, genotypes, genomic) and interaction effects (genotype-by-environment; GxE interaction) were implemented for forecasting dry biomass yield (YDY), total culm (TCM), average internode length (AIL), and culm node number (CNN). Multiple calibration sets based on different compositions and sizes were considered to evaluate performance in terms of the predictive ability (PA) and the mean square error (MSE) for a fixed testing set size. The training set size ranged from 52 to 112 to predict a fixed set of 224 unobserved genotypes across all three environments. The results showed that the model accounting for GxE interaction presented the highest PA and the lowest MSE for CNN (PA: ~0.77, MSE: ~0.5) and YDY (PA: ~0.70, MSE: ~1.3), while for TCM and AIL these ranged from ~0.28 to 0.41 and ~1.3 to 4.3, respectively. Overall, varying training sets and allocation strategies did not affect PA and MSE, with 52 non-overlapping and 0 overlapping genotypes per environment as the optimal cost-effective allocation framework. This suggests that implementing sparse testing designs could reduce phenotyping costs fivefold without compromising PA in breeding programs for perennial crops such as Miscanthus.
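The allocation idea behind sparse testing, with some genotypes shared across environments (overlapping) and others unique to one environment (non-overlapping), can be sketched as below. This is an illustrative helper, not the authors' procedure: the function name, the random assignment, and the small numbers in the usage example are all hypothetical.

```python
import random


def allocate_sparse_design(genotypes, environments, n_unique, n_shared, seed=0):
    """Assign genotypes to environments for a sparse multi-environment trial.

    n_shared genotypes are grown in every environment (overlapping);
    n_unique further genotypes are grown in exactly one environment each
    (non-overlapping). Returns a dict: environment -> list of genotypes.
    """
    pool = list(genotypes)
    needed = n_shared + n_unique * len(environments)
    if needed > len(pool):
        raise ValueError("not enough genotypes for the requested design")
    random.Random(seed).shuffle(pool)  # reproducible random assignment

    shared = pool[:n_shared]
    design = {}
    start = n_shared
    for env in environments:
        design[env] = shared + pool[start:start + n_unique]
        start += n_unique
    return design


# Hypothetical example: 30 candidate genotypes, 3 environments.
design = allocate_sparse_design(range(30), ["E1", "E2", "E3"], n_unique=5, n_shared=2)
plots = sum(len(g) for g in design.values())
print(plots, "plots instead of", 30 * 3)
```

Setting `n_shared=0` gives a fully non-overlapping design of the kind the abstract reports as most cost-effective. The prediction model then has to borrow information across environments through the genomic relationships alone.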
O'Sullivan, J.; Whittaker, C.; Xenakis, G.; Robson, T.; Perks, M.
Peatlands are an important terrestrial carbon sink which, when drained, can produce substantial CO2 efflux. Low productivity forestry planted on drained peatlands can become a net carbon source if losses from drained soils exceed sequestration by the trees. Decision support tools which assist resource allocation and intervention planning in forest-to-bog restoration are needed to mitigate this substantial environmental harm. Predicting carbon mitigation benefits associated with forest-to-bog restoration is a major challenge, however, due to the lack of long-term monitoring programs and the fact that mitigation times depend on processes distant from the intervention. Here we introduce the PEATREST life cycle assessment (LCA), which predicts carbon fluxes associated with forest-to-bog restoration, including those due to processes far from restored sites. The LCA estimates mitigation timescales, defined as the time following intervention at which the restored peatland is predicted to sequester or store more carbon than the forestry would have if retained.
Highlights:
- Here we develop a novel forest-to-bog life cycle assessment (LCA) tool
- The LCA predicts carbon mitigation times following peatland restoration
- The model combines a variety of process-based and empirical sub-models
- Example implementations for two different restoration scenarios are explored
- Sensitivity analysis highlights the model inputs that most impact outcomes
Graphical abstract: The PEATREST life cycle assessment (LCA) generates compound time series of carbon sequestration and carbon storage for two scenarios: the forest-to-bog peatland restoration (PR) and a counterfactual (CF) of forestry retention. By comparing the two scenarios, the LCA predicts the carbon mitigation timescales (vertical dashed lines). These are defined as the time following harvesting at which the peatland is predicted to sequester more (emit less), or to have stored more (lost less) carbon, than the forestry would have if retained.
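The mitigation-timescale definition above can be illustrated with a small sketch that compares a restoration series against a counterfactual series. It assumes annual net sequestration values (positive means uptake) and takes the first year in which restoration exceeds the counterfactual; the function names and the numbers are invented for illustration and are not PEATREST outputs.

```python
from itertools import accumulate


def mitigation_time(years, restored, counterfactual):
    """First year in which the restored series exceeds the counterfactual.

    Returns None if the restored scenario never overtakes the counterfactual
    within the simulated period.
    """
    for year, r, c in zip(years, restored, counterfactual):
        if r > c:
            return year
    return None


years = list(range(1, 9))
# Hypothetical annual net sequestration (arbitrary units, positive = uptake).
restored_flux = [-5, -2, 1, 3, 4, 5, 6, 6]
forestry_flux = [2, 2, 2, 2, 2, 2, 2, 2]

# Sequestration timescale: compare annual fluxes directly.
t_seq = mitigation_time(years, restored_flux, forestry_flux)

# Storage timescale: compare cumulative carbon stored since the intervention.
t_store = mitigation_time(
    years, list(accumulate(restored_flux)), list(accumulate(forestry_flux))
)
print(t_seq, t_store)  # 4 8
```

The storage timescale is later than the sequestration timescale because early losses after harvesting have to be paid back before cumulative storage catches up.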
Baudrot, V.; Kaag, M.; Charles, S.
Assessing the risk of pesticides to birds requires models that can extrapolate laboratory data to realistic exposure scenarios. In this work, we propose a new modeling framework, BIRDkiss (Bird - Impact on Reproduction via Diet, keep it simple and suitable), that accounts for both a simplified Dynamic Energy Budget (DEBkiss) of organisms and the toxicokinetics-toxicodynamics (TKTD) of chemical substances according to a trait-based approach, thereby reducing the number of parameters to identify and strengthening the statistical robustness of the critical endpoints. The BIRDkiss model describes how food intake and toxicant exposure affect growth and egg production in birds over time. The model is fully embedded within an R package, including routines for calibration, validation and prediction under single-compound scenarios performed via Bayesian inference using standard data from the OECD avian reproduction tests. The BIRDkiss model also allows the simulation of scenarios under both varying food availability and multi-compound exposures based on the two classical mixture-toxicity paradigms: Concentration Addition (CA) and Independent Action (IA). Calibration for single compounds shows good agreement with observed weights and egg counts. From these calibrations, predictions for new exposure scenarios can be readily generated. For mixtures, the IA algorithm is simpler and does not require scaling variables as in CA. Simulations indicate that high food levels do not further increase egg production (saturation), whereas substantial food reductions markedly decrease reproduction because energy is reallocated to maintenance. Exposure to chemicals combined with low food availability amplifies the decline in reproductive output. The ready-to-use mechanistic, open-source BIRDkiss tool enables predicting the impact of pesticides on avian reproduction under realistic dietary exposure profiles. The implementation of CA and IA models is a first step toward mechanistic assessment of chemical mixtures, although validation still requires empirical mixture data. The model highlights the importance of food availability and shows that chemical stress can exacerbate the negative effects of nutritional stress. Integrating such models into regulatory frameworks could improve the ecological relevance of risk assessments.
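The two mixture paradigms named above have simple textbook forms, sketched here for static effect fractions. This is not the BIRDkiss implementation, which works inside a TKTD model: the log-logistic curve, the shared slope under CA, and all function names are assumptions made for the illustration.

```python
def hill_effect(conc, ec50, slope):
    """Log-logistic effect fraction in [0, 1) for a single compound."""
    if conc <= 0:
        return 0.0
    return 1.0 / (1.0 + (ec50 / conc) ** slope)


def independent_action(effects):
    """IA: compounds act independently, so unaffected fractions multiply."""
    unaffected = 1.0
    for e in effects:
        unaffected *= 1.0 - e
    return 1.0 - unaffected


def concentration_addition(concs, ec50s, slope):
    """CA: concentrations are rescaled to toxic units (c / EC50) and summed.

    Assumes all compounds share one slope, so the mixture behaves like a
    single compound dosed at the summed toxic units.
    """
    toxic_units = sum(c / ec50 for c, ec50 in zip(concs, ec50s))
    return hill_effect(toxic_units, 1.0, slope)


# Two hypothetical compounds, each dosed at half its EC50.
concs, ec50s, slope = [5.0, 10.0], [10.0, 20.0], 2.0
singles = [hill_effect(c, e, slope) for c, e in zip(concs, ec50s)]
print(independent_action(singles), concentration_addition(concs, ec50s, slope))
```

The rescaling to toxic units is the extra step that CA needs and IA does not, which matches the abstract's remark that IA is the simpler algorithm.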
Poque, S.; Sandroni, M. A.; Garcia Caparros, P.; Westergaard, J. C.; Mouhu, K.; Ferdous, M.-E.-M.; Andreasson, E.; Grenville-Briggs, L. J.; Lankinen, A.; Roitsch, T.; Himanen, K. I. H.; Alexandersson, E.
Fitness costs of plant disease defence are often subtle and difficult to quantify. In this study, we therefore used comparative high-throughput phenotyping in two independent facilities to assess, at high time resolution, the growth, morphology and physiology of potato (cv. Desiree) with different defence mechanisms under pathogen-free conditions. Plants were either treated weekly with the resistance inducers β-aminobutyric acid (BABA; 10 mM) or potassium phosphite (KPhi; 36 mM) or comprised six transgenic lines expressing late blight resistance genes (single Rpi genes or a three-gene stack) or reduced jasmonate perception (StCOI1-RNAi). Over four weeks, image-derived traits revealed consistent cross-facility effects for plant height and colour: BABA treatment increased plant height but reduced canopy area and induced a paler greenness signature, whereas KPhi caused minimal and transient growth effects. Chlorophyll fluorescence at the NaPPI facility indicated reduced vitality (Rfd_Lss) in BABA-treated plants and increased Rfd_Lss following KPhi, while maximum PSII efficiency was largely unchanged. Several transgenic lines showed somewhat reduced above-ground biomass. Enzyme activity profiling produced distinct treatment and genotype signatures, but was strongly modulated by facility conditions that overrode these specificities. Overall, high-throughput phenotyping robustly detected subtle growth-defence trade-offs across platforms.
Highlight: High-throughput optical phenotyping validated across two independent research facilities reveals that stacked resistance genes and resistance inducers in potato trigger subtle growth trade-offs.
Graphical abstract: Experimental timeline for high-throughput plant phenotyping platforms. Created in BioRender. Poque, S. (2026) https://BioRender.com/nmkve7g
Halpern, M.
Background: Data extraction is the primary bottleneck in meta-analysis, consuming weeks of researcher time with single-extractor error rates of 17.7%. Existing LLM-based systems achieve only 26-36% accuracy on continuous outcomes, and no study has validated AI-extracted continuous data against multiple independent datasets using formal equivalence testing.
Methods: A single AI agent (Claude Opus 4.6) extracted treatment means, control means, sample sizes, and variance measures from source PDFs across five published agricultural meta-analyses spanning zinc biofortification, biostimulant efficacy, biochar amendments, predator biocontrol, and elevated CO2 effects on plant mineral nutrition. Observations were matched to reference standards using an LLM-driven alignment method. Validation employed proportional TOST equivalence testing, ICC(3,1), Bland-Altman analysis, and source-type stratification.
Results: Across five datasets, the agent produced 1,149 matched observations from 136 papers. Pearson correlations ranged from 0.984 to 0.999. Proportional TOST confirmed statistical equivalence for all five datasets (all p < 0.05). Table-sourced observations achieved 5.5x lower median error than figure-sourced observations. Aggregate effects were reproduced within 0.01-1.61 pp of published values. Independent duplicate runs confirmed extraction stability (within 0.09-0.23 pp).
Conclusions: A single AI agent achieves statistical equivalence with human-extracted meta-analysis data across five independent agricultural datasets. The approach reduces extraction cost by approximately one to two orders of magnitude while maintaining accuracy sufficient for aggregate meta-analytic pooling.
Highlights
What is already known:
- Data extraction is the primary bottleneck in meta-analysis, with single-extractor error rates of 17.7%
- Existing LLM-based extraction systems achieve only 26-36% accuracy on continuous outcomes
- No study has validated AI extraction against multiple independent datasets using formal equivalence testing
What is new:
- A single AI agent achieves statistical equivalence with human-extracted data across five agricultural meta-analyses (1,149 observations, 136 papers)
- LLM-driven alignment resolves the previously underappreciated bottleneck of moderator matching, improving correlations from 0.377-0.812 to 0.984-0.997 without changing extracted values
- Table-sourced observations achieve 5.5x lower error than figure-sourced data
Potential impact for RSM readers:
- Provides a validated, reproducible workflow for AI-assisted data extraction in meta-analysis
- Demonstrates that most apparent "extraction error" in validation studies is actually alignment error
- Offers practical quality signals (source-type labeling) for downstream meta-analysts
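Of the validation methods listed for this study, Bland-Altman analysis is the simplest to show in a few lines: it summarizes agreement between two measurement sets by the mean difference (bias) and the 95% limits of agreement. The sketch below uses the standard formulation with made-up numbers; it is not the study's code, and the variable names are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(extracted, reference):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences; the limits of agreement are
    bias +/- 1.96 sample standard deviations of those differences.
    """
    diffs = [a - b for a, b in zip(extracted, reference)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Invented example: AI-extracted means versus a human-extracted reference.
ai_values = [10.1, 20.0, 29.9, 40.2]
human_values = [10.0, 20.0, 30.0, 40.0]
bias, lower, upper = bland_altman(ai_values, human_values)
print(round(bias, 3), round(lower, 3), round(upper, 3))
```

A bias near zero with narrow limits indicates that the two extraction routes are interchangeable for pooling. Unlike a high correlation, this check would also expose a systematic offset.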
Cerimele, G.; Kent, M.; Miller, M.; Best, R.; Franks, C.; Kakar, N.; Felderhoff, T.; Sexton-Bowser, S.; Morris, G. P.
Bioavailability of iron, an essential micronutrient to plants, is low in alkaline or calcareous soils, which are prevalent across semi-arid production regions. Breeding efforts to increase tolerance to iron deficiency chlorosis (IDC) in sorghum, a major crop of semi-arid regions, are confounded by spatial variation of stress severity in field trials. Here we developed and validated two high-throughput phenotyping approaches to address this challenge, with multi-spectral aerial imaging in the field and a controlled-environment assay to isolate the effects of iron bioavailability. In the field, severity and uniformity of stress are highly predictive of genetic signals for IDC tolerance (R² > 0.6 for soil pH metrics and H²). Plot-level data filtering for stress conditions based on control genotypes successfully addresses field spatial variation (unfiltered H² = 0.18 vs. filtered H² = 0.4). The controlled-environment assay proxies field stress using iron sources with differential bioavailability, evidenced by high heritability (H² = 0.98) and phenotypic differential for hybrid control genotypes that matches field performance. Finally, we show that assay phenotypes are suitable for genome-wide association studies in global germplasm. Together, these field and lab phenomic approaches can be deployed to understand genetics of IDC tolerance and develop crops resilient to alkaline soils.
Highlight: Stress severity and uniformity greatly impact detection of genetic signals underlying iron deficiency chlorosis tolerance in sorghum. A controlled-environment assay reduces spatial heterogeneity and improves assessment of tolerance genetics.
Sattler, M. C.; Singh, A.; Bass, H. W.; Mondin, M.
Background: Maize knobs are regions of constitutive heterochromatin that are readily identified in both meiotic and somatic chromosomes. These structures have been characterized as stable throughout the cell cycle, exhibiting late replication during the S-phase, and are composed of two specific families of highly repetitive DNA sequences: K180 and TR-1. Although widely used as cytogenetic markers due to their variability in number and chromosomal position across inbred lines, hybrids, and landraces, little is known about their chromatin structure and dynamics. In this study, we analyzed chromatin accessibility of knobs using DNS-seq data across four maize tissues representing distinct developmental stages.
Results: Our results reveal that K180 knobs exhibit tissue-specific variation in chromatin accessibility, transitioning between open and closed states during development. In contrast, the TR-1 knob of chromosome 4 remained consistently inaccessible across all tissues analyzed. A knob composed of both K180 and TR-1 further supported this observation, with only the K180 region showing dynamic accessibility. To validate these findings, we also analyzed other repetitive regions such as centromeres, which showed a uniformly closed chromatin structure similar to TR-1. These results suggest a unique developmental modulation of chromatin accessibility associated with K180 repeats. While the chromatin accessibility of knobs does not reach the levels observed at Transcription Start Sites (TSS), the comparison among different classes of repetitive DNA within maize constitutive heterochromatin provides compelling evidence for sequence-specific and tissue-specific chromatin dynamics.
Conclusions: Our findings uncover a previously unrecognized property of maize knobs and establish a reference for future studies on chromatin organization and epigenetic regulation of repetitive DNA in plant genomes.
L. Rocha, H.; Bucher, E.; Zhang, S.; Deshpande, A.; Bergman, D. R.; Heiland, R.; Macklin, P. R.
Agent-based models (ABMs) are widely used to study complex multiscale biological systems, particularly in cancer research. However, their high-dimensional parameter spaces, stochasticity, and computational costs pose significant challenges for uncertainty quantification, calibration, and systematic comparison of competing mechanistic hypotheses. PhysiCell has evolved into a growing ecosystem of open-source tools supporting physics-based multicellular modeling, including model construction, visualization, and data integration. However, despite these advances, systematic support for uncertainty-aware model analysis, scalable parameter exploration, and formal calibration workflows remains limited. Here, we introduce UQ-PhysiCell, an open-source Python package that enables uncertainty quantification, calibration, and model selection for PhysiCell models using a modular and scalable workflow. UQ-PhysiCell acts as a manager of PhysiCell simulation inputs and outputs, including parameters, initial conditions, rules, and MultiCellDS-compliant objects, and provides automated orchestration of large ensembles of simulations. The framework supports multiple levels of parallelism to accelerate the analysis, including the parallel execution of independent simulations, stochastic replicates, and downstream analysis tasks. UQ-PhysiCell integrates seamlessly with established Python libraries for sensitivity analysis, optimization, Bayesian inference, and surrogate modeling, allowing users to construct customized pipelines that match their modeling goals and computational resource requirements. By decoupling model execution from statistical analysis and emphasizing extensibility and reproducibility, UQ-PhysiCell lowers the barrier to applying rigorous uncertainty-aware methodologies to agent-based modeling and supports the systematic evaluation of PhysiCell models in biological and biomedical research. 
Author summary: We developed UQ-PhysiCell to address a key challenge in agent-based modeling: the systematic quantification of uncertainty in complex stochastic simulations. PhysiCell is widely used to model multicellular biological systems, particularly in cancer research; however, practical tools for uncertainty analysis, calibration, and model comparison are often developed in an ad hoc manner. This makes the results difficult to reproduce and limits the ability to rigorously evaluate competing biological hypotheses. UQ-PhysiCell provides a flexible Python framework that manages the inputs and outputs of PhysiCell simulations and enables large-scale computational analysis. We designed the software to be modular, allowing users to build their own analysis pipelines and combine different methodologies for sensitivity analysis, calibration, and model selection. Rather than enforcing a single workflow, UQ-PhysiCell supports customization to match specific scientific questions and computational requirements. To make uncertainty-aware analyses feasible for computationally intensive agent-based models, UQ-PhysiCell implements multiple parallelism strategies, enabling the concurrent execution of simulations, stochastic replicates, and downstream analyses. By promoting reproducibility, scalability, and methodological flexibility, UQ-PhysiCell helps researchers move beyond single best-fit simulations toward more reliable and interpretable computational modeling.
Halpin-McCormick, A.; Nalla, M. K.; Radlicz, Z.; Zhang, A.; Fumia, N.; Lin, T.-h.; Lin, S.-w.; Wang, Y.-w.; Zohoungbogbo, H. P. F.; Wang, D. R.; Runck, B.; Gore, M. A.; Kantar, M. B.; Barchenger, D. W.
Climate change increasingly threatens global Capsicum (pepper) production. Accelerating the deployment of climate-resilient cultivars requires effective use of genetic diversity conserved in genebanks. We implement a "turbocharging" strategy in Capsicum by integrating genome-wide association studies and genomic prediction in a core collection (n = 423), followed by genomic prediction across the global collection (n = 10,250) using the core as a training population. We generated genomic estimated breeding values (GEBVs) for 31 high-accuracy traits (r > 0.5) encompassing hyperspectral phenotypes (heat/control), agronomic performance (heat/control) and fruit quality. To enhance accessibility and decision-making, we developed a large language model (LLM) integrated application that enables flexible, preference-based selection of candidates. By narrowing the parental decision space, this framework streamlines screening of large germplasm collections while balancing climate resilience, quality attributes and market demands. Our approach provides a scalable decision-support system to accelerate climate-resilient Capsicum breeding and maximize global genetic resources.
Robles-Zazueta, C. A.; Strack, T.; Schmidt, M.; Callipo, P.; Robinson, H.; Vasudevan, A.; Voss-Fels, K.
Grapevine cluster architecture is a key selection target in breeding programs because it influences disease susceptibility, yield stability and juice quality. High-throughput phenotyping offers a rapid and non-destructive approach to capture biochemical and structural variation in these traits, yet the influence of plant organ reflectance and data partitioning strategies on trait prediction remains poorly understood. In this study, we evaluated how hyperspectral reflectance from different grapevine organs contributes to the prediction of cluster architecture and juice quality traits in two clonal populations of Riesling and Pinot. Using partial least squares regression (PLSR), we assessed the prediction accuracy of eight cluster architecture and six juice quality traits under two data partitioning strategies. Models based on cluster reflectance outperformed those using dry leaf reflectance for most traits, except for pH. Partitioning the dataset by cluster type increased trait variance and improved predictions for number of berries (R² = 0.53), berry diameter (R² = 0.79), and total acidity (R² = 0.48). The visible, red-edge and NIR spectral regions were the most informative for predicting the traits studied. Together, our results highlight the importance of organ-specific data and appropriate calibration strategies to improve phenomic models for the development of scalable proxies for grapevine improvement. Highlight: Spectral phenomics reveals that prediction accuracy in grapevine depends on organ spectral signatures and traits, with cluster reflectance outperforming leaves, informing new phenotyping strategies for breeding improvement.
Enyew, M.; Studer, A. J.; Woodford, R.; Ermakova, M.; von Caemmerer, S.; Cousins, A. B.
Understanding the regulation of enzyme activity involved in photosynthesis is essential for engineering enhanced carbon fixation in crops. In C4 plants, the enzyme phosphoenolpyruvate carboxylase (PEPC, EC 4.1.1.31) is one of the most abundant leaf enzymes and plays an essential role in photosynthetic carbon dioxide (CO2) fixation. The enzyme also plays a key role in central metabolism (e.g., providing intermediates to the citric acid cycle) and therefore must be highly regulated to coordinate its activity. The regulation of PEPC activity can occur allosterically by glucose 6-phosphate activation and malate inhibition, which is in part influenced by reversible phosphorylation. A specific light-dependent phosphorylation of PEPC at an N-terminal serine residue by the PEPC-protein kinase (PEPC-PK) can regulate its sensitivity to this allosteric regulation. However, the impact of this PEPC phosphorylation has not been tested in a C4 crop. Therefore, we created PEPC-PK mutant lines in Zea mays to assess the impact of PEPC phosphorylation on its allosteric regulation, photosynthesis, and growth. While the maximum PEPC activity was unchanged, PEPC in the PEPC-PK mutant plants was not phosphorylated under light and was more sensitive to malate inhibition. However, gas exchange, electron transport, and field biomass analyses showed no differences in the PEPC-PK mutant plants. These results demonstrate that in Z. mays PEPC phosphorylation affects enzyme sensitivity to malate in vitro but does not substantially alter photosynthetic performance or growth under field conditions, suggesting additional regulation of PEPC activity in planta.
Gregoire, M.; Pateyron, S.; Brunaud, V.; Tamby, J. P.; Benghelima, L.; Martin, M.-L.; Girin, T.
Nitrogen fertilizers are essential for crop productivity but cause environmental harm, necessitating the development of cultivars that thrive under limited nitrogen. This study investigates the transcriptomic response to nitrate in Arabidopsis thaliana (a model dicot), Brachypodium distachyon (a model Pooideae), and Hordeum vulgare (barley, a domesticated Pooideae) to identify conserved and species-specific molecular mechanisms. Using RNA-seq after 1.5 and 3 hours of nitrate treatment, we found that core nitrate-responsive biological processes - such as nitrate transport, assimilation, carbon metabolism, and hormone signaling - are largely conserved across species. However, comparative analysis at gene level based on orthology revealed specificities between the species. For instance, rRNA processing was uniquely stimulated in Arabidopsis, while cysteine biosynthesis from serine and gibberellin biosynthesis were specifically regulated in Brachypodium and barley. Orthologs of key nitrate-responsive genes (e.g., NRT, NLP, TCP20) exhibited variable regulation, reflecting potential adaptations linked to domestication or nutrient acquisition strategies. These findings highlight the importance of integrating model and crop species to uncover targets for improving nitrogen use efficiency in cereals. The study provides a pipeline integrating gene ontology and orthology analyses to compare transcriptomic responses between species.
Vega, A. G.; Bennett, N. E.; Beadle, E. P.; Alshafeay, S.; Chitturi, R.; Nagarimadugu, A.; Villur, H.; Jaiswal, A.; Rhoades, J. A.; Harris, L. A.
Tumor-induced bone disease (TIBD) arises from a complex interplay between metastatic cancer cells and the bone microenvironment, creating a self-reinforcing "vicious cycle" of bone destruction and tumor growth. Experimental evidence from our group (Buenrostro et al., Bone 113:77-88, 2018) suggests that tumor cells in the bone microenvironment early in disease rely more heavily on bone-derived growth factors, such as transforming growth factor-β (TGF-β), to sustain proliferation than tumor cells late in disease, which may grow independently of these factors. Here, we integrate a mechanistic, population-dynamics model of tumor-bone interactions with in vivo data to test the hypothesis that inhibiting bone resorption suppresses growth of non-adapted but not bone-adapted tumors. The model includes key regulators of TIBD, including TGF-β-driven tumor proliferation, parathyroid hormone-related protein (PTHrP) secretion, and osteoblast (OB)-osteoclast (OC) coupling. Parameter calibration using data from mice injected intratibially with parental (non-adapted) and bone-adapted breast cancer cells reveals distinct parameter values for each tumor type. Bone-adapted cells exhibit a higher basal division rate and reduced sensitivity to TGF-β-mediated stimulation, whereas parental-derived tumor cells depend more strongly on TGF-β and secrete PTHrP at higher rates to compensate for their slower growth. Model simulations reproduce the greater bone loss observed experimentally for bone-adapted tumors and predict that, for non-adapted tumors, bone destruction results from a slower but meaningful rise in OC activity and a possible moderate decline in OBs. Simulated treatment of bone-adapted tumors with the bisphosphonate zoledronic acid stabilizes bone density but has limited or highly variable effects on tumor growth.
These results suggest that OC inhibition alone may be insufficient to restrain tumor expansion once tumors have adapted to the bone microenvironment. Together, these findings support the hypothesis that tumor adaptation to the bone microenvironment governs dependence on bone-derived growth factors and response to OC-targeted therapy, underscoring the value of mechanistic modeling for elucidating tumor-bone interactions and guiding tumor-type-specific treatment strategies for TIBD.